Crime Rate Against Women

Data from 'Data.gov.in'

File names:

>> crcCAW.CSV : 'Number of crimes against women'
>> pacCAW.CSV : 'Number of persons involved in crimes against women'

In [6]:
# Import Necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline

In [231]:
crimes = pd.read_csv('crcCAW.csv')

In [232]:
crimes.head()


Out[232]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
0 ANDHRA PRADESH RAPE 871 1002 946 1016 935 1049 1070 1257 1188 1362 1442 1341
1 ARUNACHAL PRADESH RAPE 33 38 31 42 35 37 48 42 59 47 42 46
2 ASSAM RAPE 817 970 1095 1171 1238 1244 1437 1438 1631 1721 1700 1716
3 BIHAR RAPE 888 1040 985 1390 1147 1232 1555 1302 929 795 934 927
4 CHHATTISGARH RAPE 959 992 898 969 990 995 982 978 976 1012 1053 1034

In [25]:
crimes.plot()


Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0xc0d7250>

The plot above is not satisfactory: it draws one line per year column against the row number, which says little about the data. So let's look at the data once again.
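
One likely reason the plot is hard to read: `DataFrame.plot()` draws one line per numeric column against the row index. A minimal sketch of reshaping the data so that years run along the x-axis, using a tiny hand-made frame built from three rows of the `head()` output (trimmed to three years, not the full file); the names `sample` and `by_year` are illustrative:

```python
import pandas as pd

# Three rows copied from the head() output, trimmed to three years for brevity.
sample = pd.DataFrame({
    "STATE/UT": ["ANDHRA PRADESH", "ARUNACHAL PRADESH", "ASSAM"],
    "CRIME HEAD": ["RAPE", "RAPE", "RAPE"],
    "2001": [871, 33, 817],
    "2002": [1002, 38, 970],
    "2003": [946, 31, 1095],
})

# Use the state as the row label, drop the text column, then transpose
# so that each row is a year and each column is a state.
by_year = sample.set_index("STATE/UT").drop(columns="CRIME HEAD").T

# by_year.plot() would now draw one line per state with years on the x-axis.
print(by_year)
```

With the frame in this shape, each line in the plot corresponds to a state rather than to a year.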


In [28]:
crimes.head()


Out[28]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
0 ANDHRA PRADESH RAPE 871 1002 946 1016 935 1049 1070 1257 1188 1362 1442 1341
1 ARUNACHAL PRADESH RAPE 33 38 31 42 35 37 48 42 59 47 42 46
2 ASSAM RAPE 817 970 1095 1171 1238 1244 1437 1438 1631 1721 1700 1716
3 BIHAR RAPE 888 1040 985 1390 1147 1232 1555 1302 929 795 934 927
4 CHHATTISGARH RAPE 959 992 898 969 990 995 982 978 976 1012 1053 1034
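
Let's try something different:
first, extract the rape data into a new DataFrame.
Note: the rape data sits in the first rows of the file; rows 0 to 28 cover the states and TOTAL (STATES), and rows 29 to 37 cover the union territories and the remaining totals.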

In [34]:
crimes.values


Out[34]:
array([['ANDHRA PRADESH', 'RAPE', 871L, ..., 1362L, 1442L, 1341L],
       ['ARUNACHAL PRADESH', 'RAPE', 33L, ..., 47L, 42L, 46L],
       ['ASSAM', 'RAPE', 817L, ..., 1721L, 1700L, 1716L],
       ..., 
       ['PUDUCHERRY', 'TOTAL CRIMES AGAINST WOMEN', 119L, ..., 115L, 89L,
        61L],
       ['TOTAL (UTs)', 'TOTAL CRIMES AGAINST WOMEN', 2623L, ..., 4904L,
        5559L, 6339L],
       ['TOTAL (ALL-INDIA)', 'TOTAL CRIMES AGAINST WOMEN', 143795L, ...,
        213585L, 228650L, 244270L]], dtype=object)

Let's try to sum up all the crimes.


In [47]:
# Sum every year column over all rows (axis=0) to get one total per year
year_wise_crime = crimes.iloc[:, 2:].sum(axis=0)

In [59]:
year_wise_crime


Out[59]:
2001     862770
2002     858204
2003     843606
2004     925998
2005     933318
2006     988590
2007    1111872
2008    1175142
2009    1222824
2010    1281510
2011    1371900
2012    1465620
dtype: int64
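
The `axis` argument of `sum` decides the direction of the total. A small sketch on a hand-made frame (two rows and two years taken from the output above) shows the difference; `axis=0` collapses the rows and gives one value per year, which matches the per-year output above:

```python
import pandas as pd

sample = pd.DataFrame(
    {"2001": [871, 33], "2002": [1002, 38]},
    index=["ANDHRA PRADESH", "ARUNACHAL PRADESH"],
)

per_year = sample.sum(axis=0)   # one total per column (year)
per_state = sample.sum(axis=1)  # one total per row (state)

print(per_year)
print(per_state)
```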

In [65]:
year_wise_crime.plot(kind='bar', figsize=(8,5), legend=True, title='Number of Crimes per Year',)


Out[65]:
<matplotlib.axes._subplots.AxesSubplot at 0x12471a90>

Yet another plot, this time as a line chart


In [60]:
year_wise_crime.plot(kind='line')


Out[60]:
<matplotlib.axes._subplots.AxesSubplot at 0x120462f0>

Making it more beautiful


In [121]:
# year_wise_crime is a Series, so it has a name rather than columns
year_wise_crime.name = 'Crimes'
year_wise_crime.index.values


Out[121]:
array(['2001', '2002', '2003', '2004', '2005', '2006', '2007', '2008',
       '2009', '2010', '2011', '2012'], dtype=object)

In [126]:
year_wise_crime.plot(figsize=(10,5))


Out[126]:
<matplotlib.axes._subplots.AxesSubplot at 0x1835aa70>

In [94]:
year_wise_crime.plot(kind='bar')


Out[94]:
<matplotlib.axes._subplots.AxesSubplot at 0x11296090>

In [103]:
year_wise_crime.plot(kind='barh', figsize=(10,5))


Out[103]:
<matplotlib.axes._subplots.AxesSubplot at 0x1457e570>

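The problem with the sums above is that they also add up the total rows: did you notice the TOTAL rows in the DataFrame? Every crime is therefore counted several times. The 2001 sum of 862770, for example, is exactly six times the all-India figure of 143795 for that year.

Now we will try to get only the rape data.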
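
A minimal sketch of one way to avoid this double counting: drop the aggregate rows before summing. The frame below is a toy with invented numbers (not from the dataset), and the rule that every aggregate `STATE/UT` label starts with `TOTAL` is an assumption based on the labels visible in this notebook.

```python
import pandas as pd

# Toy data with invented counts; only the labels mirror the real file.
toy = pd.DataFrame({
    "STATE/UT": ["STATE A", "STATE B", "TOTAL (STATES)", "TOTAL (ALL-INDIA)",
                 "STATE A", "STATE B", "TOTAL (STATES)", "TOTAL (ALL-INDIA)"],
    "CRIME HEAD": ["RAPE"] * 4 + ["TOTAL CRIMES AGAINST WOMEN"] * 4,
    "2001": [10, 20, 30, 30, 10, 20, 30, 30],
})

# Keep rows that are neither a regional total nor the all-crimes summary head.
is_total_region = toy["STATE/UT"].str.startswith("TOTAL")
is_total_head = toy["CRIME HEAD"] == "TOTAL CRIMES AGAINST WOMEN"
detail = toy[~is_total_region & ~is_total_head]

clean_sum = detail["2001"].sum()  # counts each case once
naive_sum = toy["2001"].sum()     # counts totals on top of the detail rows
print(clean_sum, naive_sum)
```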


In [233]:
crimes.head()


Out[233]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
0 ANDHRA PRADESH RAPE 871 1002 946 1016 935 1049 1070 1257 1188 1362 1442 1341
1 ARUNACHAL PRADESH RAPE 33 38 31 42 35 37 48 42 59 47 42 46
2 ASSAM RAPE 817 970 1095 1171 1238 1244 1437 1438 1631 1721 1700 1716
3 BIHAR RAPE 888 1040 985 1390 1147 1232 1555 1302 929 795 934 927
4 CHHATTISGARH RAPE 959 992 898 969 990 995 982 978 976 1012 1053 1034

In [234]:
rape = crimes[crimes['CRIME HEAD'] == 'RAPE']
rape


Out[234]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
0 ANDHRA PRADESH RAPE 871 1002 946 1016 935 1049 1070 1257 1188 1362 1442 1341
1 ARUNACHAL PRADESH RAPE 33 38 31 42 35 37 48 42 59 47 42 46
2 ASSAM RAPE 817 970 1095 1171 1238 1244 1437 1438 1631 1721 1700 1716
3 BIHAR RAPE 888 1040 985 1390 1147 1232 1555 1302 929 795 934 927
4 CHHATTISGARH RAPE 959 992 898 969 990 995 982 978 976 1012 1053 1034
5 GOA RAPE 12 12 31 37 20 21 20 30 47 36 29 55
6 GUJARAT RAPE 286 267 236 339 324 354 316 374 433 408 439 473
7 HARYANA RAPE 398 361 353 386 461 608 488 631 603 720 733 668
8 HIMACHAL PRADESH RAPE 124 137 126 153 141 113 159 157 183 160 168 183
9 JAMMU & KASHMIR RAPE 169 192 211 218 201 250 288 219 237 245 277 303
10 JHARKHAND RAPE 567 797 712 797 753 799 855 791 719 773 784 812
11 KARNATAKA RAPE 293 292 321 291 343 400 436 446 509 586 636 621
12 KERALA RAPE 562 499 394 480 478 601 512 568 568 634 1132 1019
13 MADHYA PRADESH RAPE 2851 2891 2738 2875 2921 2900 3010 2937 2998 3135 3406 3425
14 MAHARASHTRA RAPE 1302 1352 1268 1388 1545 1500 1451 1558 1483 1599 1701 1839
15 MANIPUR RAPE 20 14 18 31 25 40 20 38 31 34 53 63
16 MEGHALAYA RAPE 26 38 40 54 63 74 82 88 112 149 130 164
17 MIZORAM RAPE 52 76 54 20 37 72 83 77 83 92 77 103
18 NAGALAND RAPE 17 17 14 18 17 23 13 19 22 16 23 21
19 ODISHA RAPE 790 691 725 770 799 985 939 1113 1023 1025 1112 1458
20 PUNJAB RAPE 298 299 380 390 398 442 519 517 511 546 479 680
21 RAJASTHAN RAPE 1049 1051 1050 1038 993 1085 1238 1355 1519 1571 1800 2049
22 SIKKIM RAPE 8 6 10 3 18 20 24 20 18 18 16 34
23 TAMIL NADU RAPE 423 534 557 618 571 457 523 573 596 686 677 737
24 TRIPURA RAPE 102 108 114 160 162 189 157 204 190 238 205 229
25 UTTAR PRADESH RAPE 1958 1415 911 1397 1217 1314 1648 1871 1759 1563 2042 1963
26 UTTARAKHAND RAPE 74 89 107 115 133 147 117 87 111 121 129 148
27 WEST BENGAL RAPE 709 759 1002 1475 1686 1731 2106 2263 2336 2311 2363 2046
28 TOTAL (STATES) RAPE 15658 15939 15327 17641 17651 18682 20096 20953 20874 21603 23582 24157
29 A & N ISLANDS RAPE 3 2 2 10 4 6 3 12 18 24 13 12
30 CHANDIGARH RAPE 18 18 18 19 33 19 22 20 29 31 27 27
31 D & N HAVELI RAPE 6 4 1 7 5 6 7 6 4 3 4 3
32 DAMAN & DIU RAPE 0 0 5 1 2 3 1 0 1 1 1 5
33 DELHI RAPE 381 403 490 551 658 623 598 466 469 507 572 706
34 LAKSHADWEEP RAPE 0 1 2 0 0 0 1 2 1 0 0 0
35 PUDUCHERRY RAPE 9 6 2 4 6 9 9 8 1 3 7 13
36 TOTAL (UTs) RAPE 417 434 520 592 708 666 641 514 523 569 624 766
37 TOTAL (ALL-INDIA) RAPE 16075 16373 15847 18233 18359 19348 20737 21467 21397 22172 24206 24923

In [149]:
totalrape = rape[rape['STATE/UT'] == 'TOTAL (ALL-INDIA)']
totalrape


Out[149]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
37 TOTAL (ALL-INDIA) RAPE 16075 16373 15847 18233 18359 19348 20737 21467 21397 22172 24206 24923
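
The two filtering steps can also be combined into a single boolean mask. A sketch on a two-row excerpt of the table above (rows 36 and 37, years 2001 and 2002 only); each condition needs its own parentheses when joined with `&`. The names `excerpt`, `mask` and `all_india_rape` are illustrative.

```python
import pandas as pd

excerpt = pd.DataFrame({
    "STATE/UT": ["TOTAL (UTs)", "TOTAL (ALL-INDIA)"],
    "CRIME HEAD": ["RAPE", "RAPE"],
    "2001": [417, 16075],
    "2002": [434, 16373],
}, index=[36, 37])

# Both conditions must hold for a row to be kept.
mask = (excerpt["CRIME HEAD"] == "RAPE") & (excerpt["STATE/UT"] == "TOTAL (ALL-INDIA)")
all_india_rape = excerpt[mask]
print(all_india_rape)
```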

In [185]:
# Keep only the year columns (column 2 onward) and transpose so the years become rows
totalrape2 = totalrape.iloc[:,2:].T

In [190]:
totalrape2.index.name = 'Year'
totalrape2.columns = ['No. of Crimes']
totalrape2


Out[190]:
No. of Crimes
Year
2001 16075
2002 16373
2003 15847
2004 18233
2005 18359
2006 19348
2007 20737
2008 21467
2009 21397
2010 22172
2011 24206
2012 24923
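
An alternative to transposing a one-row frame, sketched on a shortened copy of the all-India row (three years only): selecting the row with `.iloc[0, 2:]` returns a Series indexed by year, which plots the same way. The names `totalrape_sample` and `series` are illustrative.

```python
import pandas as pd

totalrape_sample = pd.DataFrame({
    "STATE/UT": ["TOTAL (ALL-INDIA)"],
    "CRIME HEAD": ["RAPE"],
    "2001": [16075],
    "2002": [16373],
    "2003": [15847],
}, index=[37])

# Take the single row, skip the two text columns, and cast back to integers
# (a row that mixes strings and numbers comes out with object dtype).
series = totalrape_sample.iloc[0, 2:].astype(int)
series.index.name = "Year"
series.name = "No. of Crimes"
print(series)
```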

In [235]:
totalrape2.plot(figsize=(8,5), title='Number of Rapes per Year\n in India')


Out[235]:
<matplotlib.axes._subplots.AxesSubplot at 0x18b92390>

The above is the final chart of reported rape cases in India.


In [239]:
totalcrimes = crimes[crimes['CRIME HEAD'] == 'TOTAL CRIMES AGAINST WOMEN']
totalcrimes.head()


Out[239]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
418 ANDHRA PRADESH TOTAL CRIMES AGAINST WOMEN 16477 18880 18382 18921 20819 21484 24738 24111 25569 27244 28246 28171
419 ARUNACHAL PRADESH TOTAL CRIMES AGAINST WOMEN 180 159 139 148 150 168 185 175 164 190 171 201
420 ASSAM TOTAL CRIMES AGAINST WOMEN 4243 5092 5312 5700 6027 6801 6844 8122 9721 11555 11503 13544
421 BIHAR TOTAL CRIMES AGAINST WOMEN 5356 5743 5900 8091 6019 6740 7548 8662 8803 8471 10231 11229
422 CHHATTISGARH TOTAL CRIMES AGAINST WOMEN 3989 3538 3336 3763 3599 3757 3775 3962 4002 4176 4219 4228

In [240]:
totalcrimes1 = totalcrimes[totalcrimes['STATE/UT'] == 'TOTAL (ALL-INDIA)']

In [241]:
totalcrimes1


Out[241]:
STATE/UT CRIME HEAD 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
455 TOTAL (ALL-INDIA) TOTAL CRIMES AGAINST WOMEN 143795 143034 140601 154333 155553 164765 185312 195857 203804 213585 228650 244270

In [245]:
finaltotalcrimes = totalcrimes1.iloc[:,2:]
finaltotalcrimes


Out[245]:
2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
455 143795 143034 140601 154333 155553 164765 185312 195857 203804 213585 228650 244270

In [246]:
finaltotalcrimes = finaltotalcrimes.T

In [249]:
finaltotalcrimes.index.name = 'Year'
finaltotalcrimes.columns = ['TOTAL CRIMES AGAINST WOMEN']
finaltotalcrimes


Out[249]:
TOTAL CRIMES AGAINST WOMEN
Year
2001 143795
2002 143034
2003 140601
2004 154333
2005 155553
2006 164765
2007 185312
2008 195857
2009 203804
2010 213585
2011 228650
2012 244270

In [252]:
# Store the axes as `ax`; assigning to `plt` would shadow matplotlib.pyplot
ax = finaltotalcrimes.plot(figsize=(7,5), title='Total Crimes Against Women per Year\n in India')



In [254]:
bar = finaltotalcrimes.plot(kind='bar', figsize=(8,5), title='Total Crimes Against Women per Year\n in India')



In [ ]:
# To drop rows by position: df.drop(df.index[[1, 3]])
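
The note above refers to dropping rows by position. A minimal sketch on a toy frame (invented values): `df.index[[1, 3]]` looks up the labels at positions 1 and 3, and `drop` returns a new frame without them, leaving `df` unchanged.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40, 50]}, index=list("abcde"))

# Positions 1 and 3 correspond to labels "b" and "d".
trimmed = df.drop(df.index[[1, 3]])
print(trimmed)
```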